Maximizing Roster Efficiency in the MLS

Authors

Oliver Daboo

Otis Birnbaum

Published

July 25, 2025

Introduction

Data

Figure 1. Scatter plot of each team’s average roster salary against its xG difference for the 2024 season. The dashed lines mark the mean of each variable and divide the teams into four quadrants. The blue line is the fitted linear regression line, which shows no statistically significant relationship between team spending and performance (Table 3, p ≈ 0.90).
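
The quadrant grouping in Figure 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `assign_quadrants`, the team names, and all numbers are hypothetical placeholders, and the only assumption taken from the caption is that the cut points are the means of the two variables.

```python
from statistics import mean


def assign_quadrants(teams):
    """Label each team by whether its spend and xG difference exceed the means.

    `teams` maps a team name to (average_salary, xg_difference).
    """
    mean_salary = mean(salary for salary, _ in teams.values())
    mean_xgd = mean(xgd for _, xgd in teams.values())
    labels = {}
    for name, (salary, xgd) in teams.items():
        spend = "high spend" if salary > mean_salary else "low spend"
        perf = "high xG diff" if xgd > mean_xgd else "low xG diff"
        labels[name] = f"{spend}, {perf}"
    return labels


# Placeholder values for illustration only (not real MLS data).
example_teams = {
    "Team A": (900_000, 12.0),
    "Team B": (300_000, 8.0),
    "Team C": (800_000, -10.0),
    "Team D": (400_000, -6.0),
}
quadrants = assign_quadrants(example_teams)
```

Under this labeling, the teams of most interest for roster efficiency would be those in the "low spend, high xG diff" quadrant.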

Figure 2. ECDF plot of Inter Miami’s 2024 roster salaries, showing a very top-heavy spread.

Figure 3. ECDF plot of the New York Red Bulls’ 2021 roster salaries, showing a relatively even spread.
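
For readers unfamiliar with ECDF plots, the sketch below shows how the curve is computed and one simple way to quantify "top-heavy". It is an illustration under stated assumptions: the two rosters are invented numbers, and `top_share` is a hypothetical helper, not a measure taken from the article.

```python
def ecdf(values):
    """Return the sorted values and the fraction of observations at or below each."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def top_share(values, k=3):
    """Share of total salary held by the k highest-paid players (hypothetical helper)."""
    ordered = sorted(values, reverse=True)
    return sum(ordered[:k]) / sum(ordered)


# Invented rosters for illustration only (not real MLS salaries).
top_heavy = [100_000] * 8 + [5_000_000, 10_000_000]
even = [400_000, 500_000, 600_000, 700_000, 800_000]

xs, ys = ecdf(even)
share_top_heavy = top_share(top_heavy, k=2)
share_even = top_share(even, k=2)
```

On a top-heavy roster the ECDF climbs to nearly 1 at low salaries and then has a long flat tail out to the few very large contracts, while an even roster produces a steadier climb.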

Figure 4. Histogram of player salaries, with a reference line at $2 million marking where we cut off observations.

Figure 5. American Soccer Analysis’ explanation of its G+ metric.

The Top 5 MLS Players 2021–2024 Based on G+ Per 90 min.
Player, Year | Club | Position | Age | Nationality | G+ Per 90 min.
Lionel Messi, 2024 | Inter Miami | W | 37 | Argentina | 0.572
Cucho Hernandez, 2024 | Columbus Crew | ST | 25 | Colombia | 0.471
Adam Buksa, 2021 | New England Revolution | ST | 25 | Poland | 0.455
Riqui Puig, 2024 | LA Galaxy | CM | 25 | Spain | 0.455
Cucho Hernandez, 2023 | Columbus Crew | ST | 24 | Colombia | 0.451

Table 1. The best MLS player seasons from 2021-2024 based on goals added (G+) per 90 minutes played. The presence of standout players such as Lionel Messi in this top 5 supports the validity of goals added as a performance metric.

The Top 5 Most Efficient Player Seasons in the MLS in the last 4 Years
Player, Year | Club | Position | Age | Nationality | G+ Per 90 min. Per $10k
Patrick Agyemang, 2024 | Charlotte FC | ST | 23 | USA | 0.0475
Célio Pompeu, 2023 | St. Louis City | W | 23 | Brazil | 0.0462
Tani Oluwaseyi, 2024 | Minnesota United | ST | 24 | Canada | 0.0411
Jacob Murrell, 2024 | D.C. United | ST | 20 | USA | 0.0369
Fredy Montero, 2021 | Seattle Sounders | ST | 34 | Colombia | 0.0350

Table 2. The most efficient MLS player seasons from 2021-2024, measured as goals added per 90 minutes played per $10k of compensation.
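
The efficiency metric in Table 2 can be sketched as below. This is a minimal reading of the caption, not the authors' code: it assumes the denominator is the player's guaranteed compensation for the season, and the function name and the example player are hypothetical.

```python
def goals_added_per_90_per_10k(goals_added, minutes, compensation):
    """Goals added per 90 minutes, divided by compensation in units of $10k.

    Assumes `compensation` is the season's guaranteed compensation in dollars.
    """
    if minutes <= 0 or compensation <= 0:
        raise ValueError("minutes and compensation must be positive")
    per_90 = goals_added / minutes * 90
    return per_90 / (compensation / 10_000)


# Hypothetical player: 3.0 goals added over 1,800 minutes on a $100,000 deal.
efficiency = goals_added_per_90_per_10k(3.0, 1_800, 100_000)
```

Because compensation sits in the denominator, this metric favors productive players on small contracts, which is why the list in Table 2 differs from the one in Table 1.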

Methods

Results

Figure 6. Relationship between how teams split their salary spend between offensive and defensive players and their performance efficiency. Each red dot is an MLS team season from 2021-2024, and the blue line is the fitted linear regression line.

Figure 7. Generalized Additive Models showing the relationship between a player’s salary and performance, split by their position.

Discussion

Appendix

Table 3. Linear Model Summary Predicting each Team’s xG Difference by their Average Salary
Term Estimate SE t p
(Intercept) 0.90351 7.65199 0.11808 0.90688
avg_guaranteed_compensation 0.00000 0.00001 -0.12225 0.90361

Table 4. Linear Model Summary Predicting each Team’s Points by their Total Goals Added For
Term Estimate SE t p
(Intercept) -24.48032 6.56203 -3.73060 0.00030
total_goals_added_for 1.12311 0.10015 11.21482 0.00000

Table 5. Correlation Between G+ and Points
Variable 1 Variable 2 Correlation
total_goals_added_for points 0.73
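
As a reminder of what the correlation in Table 5 measures, here is a minimal Pearson correlation sketch. The function is a generic textbook implementation rather than the authors' code. The only value taken from the article is the reported 0.73; for a one-predictor linear model, its square equals the model's R-squared.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Reported correlation between total goals added and points (Table 5).
reported_r = 0.73
# Share of variance in points explained in a one-predictor model: about 0.53.
r_squared = reported_r ** 2
```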

Table 6. Linear Model Summary Predicting each Team’s Goals Added Per $10k by their Forward to Defense Spend Ratio
Term Estimate SE t p
(Intercept) 0.04572 0.00239 19.13799 0.00000
fwd_def_spend_ratio -0.00253 0.00095 -2.65173 0.00918

Table 7. Bootstrapped RMSE Summary
Model Mean RMSE SD
enet 9.94 1.26
lasso 9.89 1.24
lm 9.89 1.24
ridge 10.12 1.25
xgb 12.89 1.60

Figure 8. Bootstrapped RMSE distributions for the five models tested to predict a team’s 2024 performance from its salary spread.
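
The bootstrapped RMSE comparison can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it fits only a one-predictor least-squares line rather than the five models in Table 7, it scores each resample on the rows left out of that resample (an out-of-bag choice the article does not confirm), and the data are synthetic.

```python
import math
import random
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def bootstrap_rmse(xs, ys, n_boot=200, seed=0):
    """Mean and SD of out-of-bag RMSE over bootstrap resamples (assumed scheme)."""
    rng = random.Random(seed)
    n = len(xs)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        held_out = sorted(set(range(n)) - set(idx))  # rows not drawn this round
        train_x = [xs[i] for i in idx]
        if not held_out or len(set(train_x)) < 2:
            continue  # skip degenerate resamples
        intercept, slope = fit_line(train_x, [ys[i] for i in idx])
        preds = [intercept + slope * xs[i] for i in held_out]
        scores.append(rmse([ys[i] for i in held_out], preds))
    return mean(scores), stdev(scores)


# Synthetic, noise-free data for illustration only: y = 2x + 1.
xs = list(range(20))
ys = [2 * x + 1 for x in xs]
mean_rmse, sd_rmse = bootstrap_rmse(xs, ys)
```

Because the synthetic data are exactly linear, the resulting RMSE is essentially zero. With real team data, the mean and SD of the scores correspond to the two columns reported in Table 7.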

Table 8. Linear Model Summary
Term Estimate SE t p
(Intercept) -17.18 6.59 -2.61 0.01
4-6 0.32 0.39 0.83 0.41
7-9 -0.08 0.87 -0.10 0.92
10-12 2.40 1.32 1.81 0.07
13-15 -0.34 1.60 -0.21 0.83
16-18 -1.78 1.25 -1.42 0.16

Adjusted R-squared: 0.131
Model p-value: 0.006

Table 9. Linear Model Summary Predicting Goals Added Per $10k of Compensation by Position
Position Term Estimate SE t p
AM (Intercept) 0.19307 0.01371 14.08591 0.00000
AM guaranteed_compensation 0.00000 0.00000 2.47279 0.01621
CB (Intercept) 0.13618 0.00393 34.68045 0.00000
CB guaranteed_compensation 0.00000 0.00000 2.45097 0.01472
CM (Intercept) 0.13525 0.00665 20.33777 0.00000
CM guaranteed_compensation 0.00000 0.00000 3.23632 0.00143
DM (Intercept) 0.12874 0.00578 22.26634 0.00000
DM guaranteed_compensation 0.00000 0.00000 4.19331 0.00004
FB (Intercept) 0.11539 0.00402 28.73566 0.00000
FB guaranteed_compensation 0.00000 0.00000 4.66337 0.00000
ST (Intercept) 0.22152 0.00944 23.46965 0.00000
ST guaranteed_compensation 0.00000 0.00000 1.61896 0.10728
W (Intercept) 0.18454 0.00597 30.90898 0.00000
W guaranteed_compensation 0.00000 0.00000 2.77011 0.00608

Table 10. Linear Model Coefficients Predicting Guaranteed Compensation
Term Estimate SE t p 95% CI Low 95% CI High
(Intercept) -556924.45 63942.642 -8.70975 0.00000 -682350.462 -431498.44
age 32072.86 2326.887 13.78359 0.00000 27508.579 36637.14
general_positionCB 90841.52 27886.941 3.25749 0.00115 36140.187 145542.86
general_positionCM 204584.75 33064.797 6.18739 0.00000 139726.844 269442.65
general_positionDM 216203.72 34260.971 6.31050 0.00000 148999.476 283407.97
general_positionST 323932.68 34220.838 9.46595 0.00000 256807.153 391058.20
general_positionW 232919.43 32080.337 7.26050 0.00000 169992.587 295846.28
general_positionAM 354967.26 50307.313 7.05598 0.00000 256287.483 453647.04
region_groupSouth America 179660.95 24445.009 7.34960 0.00000 131711.101 227610.80
region_groupCentral America/Caribbean -36228.70 41423.290 -0.87460 0.38193 -117482.112 45024.72
region_groupEurope 248329.42 25515.469 9.73250 0.00000 198279.820 298379.02
region_groupAfrica 92337.88 39352.968 2.34640 0.01908 15145.482 169530.28
region_groupAsia/Oceania 138940.39 72030.606 1.92891 0.05393 -2350.481 280231.26

Table 11. Model Fit Statistics for Compensation Model
r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
0.25863 0.25272 363048.9 43.80985 0 12 -21609.74 43247.49 43322.06 1.986294e+14 1507 1520

Table 12. Linear Model Coefficients Predicting Goals Added per 90 minutes
Term Estimate SE t p 95% CI Low 95% CI High
(Intercept) 0.11462 0.00896 12.78784 0.00000 0.09704 0.13221
age 0.00036 0.00033 1.10531 0.26920 -0.00028 0.00100
general_positionCB 0.01104 0.00391 2.82340 0.00481 0.00337 0.01871
general_positionCM 0.02092 0.00464 4.51241 0.00001 0.01182 0.03001
general_positionDM 0.01349 0.00480 2.80803 0.00505 0.00407 0.02291
general_positionST 0.10028 0.00480 20.90496 0.00000 0.09087 0.10969
general_positionW 0.06296 0.00450 14.00058 0.00000 0.05414 0.07178
general_positionAM 0.08546 0.00705 12.11784 0.00000 0.07162 0.09929
region_groupSouth America 0.02050 0.00343 5.98336 0.00000 0.01378 0.02723
region_groupCentral America/Caribbean 0.01178 0.00581 2.02833 0.04270 0.00039 0.02317
region_groupEurope 0.01319 0.00358 3.68892 0.00023 0.00618 0.02021
region_groupAfrica 0.00045 0.00552 0.08154 0.93502 -0.01037 0.01127
region_groupAsia/Oceania 0.02227 0.01010 2.20572 0.02755 0.00247 0.04208

Table 13. Model Fit Statistics for Predicting Goals Added per 90 minutes
r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
0.3408 0.33555 0.05089 64.92593 0 12 2376.352 -4724.704 -4650.134 3.90325 1507 1520

Table 14. Linear Model Coefficients Predicting Goals Added per 90 minutes per $10k
Term Estimate SE t p 95% CI Low 95% CI High
(Intercept) 0.01859 0.00092 20.15950 0.00000 0.01678 0.02040
age -0.00045 0.00003 -13.46425 0.00000 -0.00052 -0.00039
general_positionCB 0.00045 0.00040 1.11769 0.26388 -0.00034 0.00124
general_positionCM -0.00063 0.00048 -1.31558 0.18851 -0.00156 0.00031
general_positionDM -0.00047 0.00049 -0.95713 0.33866 -0.00144 0.00050
general_positionST 0.00129 0.00049 2.60438 0.00929 0.00032 0.00225
general_positionW 0.00093 0.00046 2.01465 0.04412 0.00002 0.00184
general_positionAM 0.00029 0.00073 0.40469 0.68577 -0.00113 0.00172
region_groupSouth America -0.00276 0.00035 -7.84113 0.00000 -0.00346 -0.00207
region_groupCentral America/Caribbean 0.00033 0.00060 0.54555 0.58546 -0.00085 0.00150
region_groupEurope -0.00276 0.00037 -7.50566 0.00000 -0.00348 -0.00204
region_groupAfrica -0.00093 0.00057 -1.63059 0.10319 -0.00204 0.00019
region_groupAsia/Oceania -0.00242 0.00104 -2.33306 0.01978 -0.00446 -0.00039

Table 15. Model Fit Statistics for Predicting Goals Added per 90 minutes per $10k
r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
0.18316 0.17665 0.00524 28.15897 0 12 5832.871 -11637.74 -11563.17 0.04133 1507 1520

Figure 9. Generalized Additive Model showing a parabolic relationship between a player’s age and efficiency.